Domain Specific Text Processing for Speech Synthesis
Author
Abstract
In Text-to-Speech (TTS) synthesis there are words and expressions that pose problems because some semantic knowledge is required to determine how they should be read out. This work implements a domain filter, a pre-processing module that supports the TTS system by analysing text belonging to a certain semantic domain and rewriting problematic expressions so that they are read out better. The filter was implemented with two techniques, XSLT (a language for transforming XML documents) and regular expressions. They are compared with respect to functionality, performance and usability. No significant differences in functionality are found. The regular expression implementation is found to perform better, but XSLT is easier to use and maintain. Filters were created for two semantic domains and tested on new texts. Their coverage of problematic expressions was good. Drawing on the statistics obtained from filtering new texts, the use of domain-specific filters is discussed. The idea of a general filter is also considered.
Supervisor: Jesper Högberg Examiner: Arne Andersson Passed:
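As a rough illustration of the regular-expression variant of such a domain filter, the following minimal sketch rewrites problematic expressions before the text reaches the TTS system. The domain ("sports results"), the rule list `SPORTS_RULES` and the function `apply_domain_filter` are all hypothetical and are not taken from the thesis; they only show the general shape of a rule-based rewriting step.

```python
import re

# Hypothetical rewrite rules for an illustrative "sports results" domain.
# Each rule pairs a compiled pattern with a replacement string.
# None of these rules are taken from the thesis itself.
SPORTS_RULES = [
    # A score such as "3-1" should be read as "3 to 1", not as a range or subtraction.
    (re.compile(r"\b(\d+)-(\d+)\b"), r"\1 to \2"),
    # The abbreviation "min." is expanded so the synthesizer does not spell it out.
    (re.compile(r"\bmin\.(?=\s|$)"), "minutes"),
]


def apply_domain_filter(text, rules):
    """Apply each (pattern, replacement) rule in order and return the rewritten text."""
    for pattern, replacement in rules:
        text = pattern.sub(replacement, text)
    return text


if __name__ == "__main__":
    print(apply_domain_filter("Sweden won 3-1 after 90 min. of play", SPORTS_RULES))
```

Keeping the rules in a plain list per domain means a second domain only needs its own list, which is one simple way to keep several domain-specific filters side by side.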
Similar resources
Study on Unit-Selection and Statistical Parametric Speech Synthesis Techniques
One of the interesting topics in the multimedia domain is enabling computers to produce speech. Speech synthesis grants the computer the human ability of speech production. The data-based approach and the process-based approach are the two main approaches to speech synthesis. Each approach has its own challenges. Unit-selection speech synthesis and statistical parametr...
A multilingual text processing engine for the PAPAGENO text-to-speech synthesis system
Automatic synthesis of speech from arbitrary text requires two basic operations: linguistic analysis of input text and speech waveform generation. The achieved quality of the second stage very much depends on the reliability and richness of information generated in the first stage. In this paper we discuss possibilities and problems of text analysis for multilingual speech synthesis. The langua...
Adaptive database reduction for domain specific speech synthesis
This paper raises the issue of speech database reduction adapted to a specific domain for Text-To-Speech (TTS) synthesis application. We evaluate several methods: a database pruning technique based on the statistical behaviour of the unit selection algorithm and a novel method based on the Kullback-Leibler divergence. The aim of the former method is to eliminate the least selected units during t...
Approaches for adaptive database reduction for text-to-speech synthesis
This paper raises the issue of speech database reduction adapted to a specific domain for Text-To-Speech (TTS) synthesis application. We evaluate several methods: a database pruning technique based on the statistical behaviour of the unit selection algorithm and a database adaptation method based on the Kullback-Leibler divergence. The aim of the former is to eliminate the least selected units ...
Improving ASR Recognized Speech Output for Effective Natural Language Processing
The process of converting human spoken speech into text is performed by an Automatic Speech Recognition (ASR) system. While functional examples of speech recognition can be seen in day-to-day use, most of these work under constraints of a limited domain, and/or use of additional cues to enhance the speech-to-text conversion process. However, for natural language spoken speech, the typical recog...